ABSTRACT

Objective Understanding population-level health trends 
is essential to effectively monitor and improve public 
health. The Office of the National Coordinator for Health 
Information Technology (ONC) Query Health initiative is a 
collaboration to develop a national architecture for 
distributed, population-level health queries across diverse 
clinical systems with disparate data models. Here we review 
Query Health activities, including a standards-based 
methodology, an open-source reference implementation, 
and three pilot projects. 

Materials and methods Query Health defined a 
standards-based approach for distributed population health 
queries, using an ontology based on the Quality Data 
Model and Consolidated Clinical Document Architecture, 
Health Quality Measures Format (HQMF) as the query 
language, the Query Envelope as the secure transport layer, 
and the Quality Reporting Document Architecture as the 
result language. 

Results We implemented this approach using Informatics 
for Integrating Biology and the Bedside (i2b2) and hQuery 
for data analytics and PopMedNet for access control, 
secure query distribution, and response. We deployed the 
reference implementation at three pilot sites: two public 
health departments (New York City and Massachusetts) 
and one pilot designed to support Food and Drug 
Administration post-market safety surveillance activities. 
The pilots were successful, although improved cross- 
platform data normalization is needed. 

Discussions This initiative resulted in a standards-based
methodology for population health queries, a reference 
implementation, and revision of the HQMF standard. It also 
informed future directions regarding interoperability and 
data access for ONC's Data Access Framework initiative. 

Conclusions Query Health was a test of the learning
health system that supplied a functional methodology and 
reference implementation for distributed population health 
queries that has been validated at three sites. 

BACKGROUND AND SIGNIFICANCE

The Institute of Medicine has developed a vision 
for a learning health system (LHS), which will inte- 
grate the nation's electronic healthcare components 
to share and learn from each other. 1 An initial step 
toward this is the Meaningful Use Incentive 
Program (MU), which encourages the adoption and 
use of electronic health systems. This is laying 
groundwork for LHS, which will cross organizational
boundaries for tasks such as comparative effectiveness
research, population health surveillance, and dissemination
of evidence-based medicine. As the Institute of Medicine
describes it: 'The increased complexity of health care
requires a sustainable system that gets the right care to the
right people when they need it, and then captures the
results for improvement. The nation needs a healthcare
system that learns.' 2

The Office of the National Coordinator for Health 
Information Technology (ONC) has embraced LHS 
in their strategic plan. 3 As a large-scale test of LHS 
functionality, they launched the Query Health initia- 
tive in September 2011. Query Health is a public- 
private collaboration to develop standards and ser- 
vices to enable distributed, secure, standards-based 
population health measurement. 4 This capability to 
measure population-level health trends is essential to 
public health. 

The ONC embraced a distributed query model in 
Query Health, which eliminates centralization of 
data by 'bringing questions to the data.' Individual 
organizations process queries and disclose only the 
minimum necessary information to answer the query 
— often aggregate statistics — avoiding many privacy 
and security concerns. This federated approach — 
which requires deep content and system knowledge 
of the contributing health systems 5 — is nonetheless 
being used effectively for research, cohort selection, 
and population health surveillance. 6-14 

Over the past 2 years, Query Health has devel- 
oped a methodology and a flexible, open-source 
reference implementation. 15-24 We have piloted the 
implementation at several locations and collected 
feedback, which is guiding future work on a 
national scale. 25 26 

OBJECTIVE 

Query Health was tasked with three major goals. 
First, to define a methodology for distributed, secure, 
standards-based clinical queries. Existing standards 
were used wherever possible. Second, to develop a 
reference implementation using best-of-breed techno- 
logical components. 27 Third, to implement compo- 
nents of this reference implementation at three pilot 
sites, to gauge the effectiveness of Query Health in 
real-world healthcare scenarios. 22 Two of the pilots 
were in cooperation with Departments of Health 
(New York City and Massachusetts) for disease moni- 
toring and surveillance. The third pilot focused on 
the potential to expand the data resources available 
for medical product safety surveillance through the 
Food and Drug Administration (FDA) Mini-Sentinel 
project. 

MATERIALS AND METHODS 

The general Query Health workflow is as follows: 
(1) investigators develop 'questions' to ask the data 
using a standard ontology and query format; (2) 
the question is securely distributed through a 
'query envelope' to participating data partners; (3) 
data partners process the query and return aggregate results; (4)
results are combined and reported back to the investigator 
(figure 1). 
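
As a rough illustration of this four-step workflow, the following Python sketch sends one question to two simulated data partners and sums the counts they return. The DataPartner class, the predicate-style query, and the sample patients are all hypothetical; the actual implementation expresses queries in HQMF and transports them through PopMedNet.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

# Hypothetical stand-in for one locally held patient record.
Patient = Dict[str, Any]


@dataclass
class DataPartner:
    """A participating site that keeps its patient-level data local."""
    name: str
    patients: List[Patient]

    def run(self, predicate: Callable[[Patient], bool]) -> int:
        # Step 3: the query executes locally; only a count leaves the site.
        return sum(1 for p in self.patients if predicate(p))


def distribute(predicate: Callable[[Patient], bool],
               partners: List[DataPartner]) -> Dict[str, int]:
    # Step 2: the same question is sent to every participating partner.
    return {partner.name: partner.run(predicate) for partner in partners}


def combine(site_counts: Dict[str, int]) -> int:
    # Step 4: per-site aggregates are merged for the investigator.
    return sum(site_counts.values())


# Step 1: the investigator's "question" (here a plain Python predicate).
def diabetic_over_65(p: Patient) -> bool:
    return p["age"] > 65 and "diabetes" in p["diagnoses"]


partners = [
    DataPartner("site_a", [{"age": 70, "diagnoses": {"diabetes"}},
                           {"age": 50, "diagnoses": {"diabetes"}}]),
    DataPartner("site_b", [{"age": 80, "diagnoses": {"diabetes", "asthma"}},
                           {"age": 90, "diagnoses": set()}]),
]
site_counts = distribute(diabetic_over_65, partners)
total = combine(site_counts)
```

In this sketch the investigator sees only the per-site counts and their total; no patient record crosses a site boundary.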

Ontology 

The Query Health ontology uses the Consolidated Clinical 
Document Architecture (C-CDA) to instantiate a hierarchical ter- 
minology based on the National Quality Forum's Quality Data 
Model (QDM). 15 These standards are already required as part of 
the 2014 certification for stage 2 of MU, so implementers are 
encouraged to use these standards instead of deriving new models. 
C-CDA is used to produce reports of patient data, and QDM 
defines data elements required for clinical quality measures 
(CQMs). 29 30 The reason for using this approach is summarized 
well in the 2014 Electronic Health Record (EHR) Certification 
Final Rule: 'this standard provides, for the first time, a method of 
moving a 'snapshot' of patient data from one EHR technology to 
another without loss of semantic integrity.' 31 

Query format 

The Health Level 7 (HL7) Health Quality Measures Format 
(HQMF) is Query Health's 'question language'. HQMF is an 
XML standard for platform-neutral clinical queries, which 
already has national focus and adoption because of MU. The 
National Quality Forum and the Centers for Medicare and 
Medicaid Services are using it and have released many of their 
CQMs in HQMF format. 32 33 It is possible that HQMF will 
appear in the MU stage 3 requirements. 

Despite the national attention, HQMF had inadequate com- 
putability in 2011. Query Health worked with HL7 to develop 
a second revision of HQMF that balances the flexibility needed 
by query developers and the computational tractability needed 
in implementations. This revision will be available through HL7 
shortly and will be used for future HQMF-based CQMs. 34 

In addition to the new HQMF format, a forthcoming HQMF 
QDM implementation guide will enumerate implementation 
details of using the Query Health ontology with HQMF. 35 



Query transport 

The Query Health Query Envelope standard supports secure 
transport of queries and results through a distributed network. 
It provides very granular control to data partners to authorize 
or decline data release, and it is independent of query and result 
formats. It is a flexible, secure transport mechanism. 36 
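
The separation between envelope and payload can be pictured with a small Python sketch. Every field name and the PartnerPolicy rules below are assumptions chosen for illustration and are not taken from the Query Envelope specification; the point is only that a partner can authorize or decline a request from envelope metadata without parsing the query itself.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass(frozen=True)
class Envelope:
    """Hypothetical envelope: routing metadata plus an opaque payload."""
    requester: str
    purpose: str
    payload_format: str   # a label only; the envelope never parses the payload
    payload: bytes


@dataclass
class PartnerPolicy:
    """Hypothetical per-partner release rules keyed on envelope metadata."""
    trusted_requesters: Set[str] = field(default_factory=set)
    allowed_purposes: Set[str] = field(default_factory=set)

    def authorize(self, env: Envelope) -> bool:
        # The decision uses metadata alone, so it works for any payload format.
        return (env.requester in self.trusted_requesters
                and env.purpose in self.allowed_purposes)


policy = PartnerPolicy({"city_health_dept"}, {"surveillance"})
xml_query = Envelope("city_health_dept", "surveillance", "HQMF", b"<query/>")
json_query = Envelope("unknown_org", "surveillance", "JSON", b"{}")
decisions = [policy.authorize(xml_query), policy.authorize(json_query)]
```

Because the policy never inspects the payload, the same check applies whether the query is XML, JSON, or any other format.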

Results format 

The HL7 Quality Reporting Document Architecture (QRDA) is 
Query Health's 'result language'. QRDA is another platform- 
neutral XML language based on the HL7 Reference 
Implementation Model and is already required for quality 
reporting in MU stage 2. 30 

Evaluation 

We implemented this methodology as a flexible reference imple- 
mentation using adaptable, best-of-breed, open-source technolo- 
gies. Figure 2 is a summary of this implementation. We then 
piloted the reference implementation at three sites using differ- 
ent components for each use case. We collected feedback on the 
implementation and pilot experiences. Table 1 is a summary of 
this evaluation. 

RESULTS 
Implementation 

PopMedNet 

For query transport, PopMedNet was selected. PopMedNet is an 
open-source distributed data-sharing platform funded by the 
Agency for Healthcare Research and Quality, the FDA, ONC, and 
the National Institutes of Health (NIH). 16 37 It is a key component 
of several large-scale distributed networks, including the FDA 
Mini-Sentinel, the NIH Health Care System Research 
Collaboratory Distributed Research Network, and the 
Massachusetts Department of Public Health MDPHnet system. 9-11 
PopMedNet will be used by the newly funded Patient-Centered 
Outcomes Research Institute (PCORI) National Patient-Centered 
Clinical Research Network (PCORnet) to help create and operate a 
'network-of-networks' to support clinical research. 38 PopMedNet
defines a network topology, manages access controls, distributes
queries to participating partner sites for local execution, and
manages the query response. It is agnostic to query, response, and
data formats, making it ideal for integration with disparate systems.
The PopMedNet architecture enables synchronous or asynchronous
distributed querying, enables partners to opt in to each query
or type of query (via an access control layer), and allows partners to
use their internal workflow for query response.

Figure 1 Overall design of Query Health. Various stakeholders can develop queries, which are distributed securely and sent to a variety of data
partners. These data partners process the queries and return aggregate counts, so that sensitive data never leave individual sites. Query Health uses
a variety of standards: a Query Envelope, a Data Model, Health Quality Measures Format (HQMF), and Quality Reporting Document Architecture
(QRDA).

Figure 2 The technologies used in the Query Health reference
implementation. Individual pilots varied in their use of these
components (see Table 1). Informatics for Integrating Biology and
the Bedside (i2b2) is used as a graphical query composer. PopMedNet
is the query distribution and authentication engine. i2b2 and hQuery
are the back-end data warehouses used by data partners to connect to
Query Health. Health Quality Measures Format (HQMF) is used to
communicate queries in a standardized format.

Each PopMedNet query includes query metadata that we 
extended to conform to the Query Health Query Envelope stan- 
dards. Each query is defined at a central PopMedNet portal, 
securely distributed using the Query Envelope, 'unpacked' at the 
participating site, and processed locally by a Data Mart Client 
program that manages the local execution and response. The 
query response is also packaged into the Query Envelope for 
return to the requester via PopMedNet. 
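
A minimal sketch of this site-side handling follows, assuming a dictionary-shaped envelope and caller-supplied approve and execute functions. None of these names come from PopMedNet; they are placeholders that show the order of operations: unpack, decide locally whether to run, execute, and package the response.

```python
from typing import Callable, Optional


class DataMartClient:
    """Hypothetical local agent running at a partner site."""

    def __init__(self, approve: Callable[[dict], bool],
                 execute: Callable[[str], int]) -> None:
        self._approve = approve   # the site's own opt-in decision
        self._execute = execute   # runs the query against local data

    def handle(self, envelope: dict) -> dict:
        """Unpack an incoming envelope and build the response envelope."""
        metadata = envelope["metadata"]
        if not self._approve(metadata):
            return self._respond(metadata, "declined", None)
        count = self._execute(envelope["query"])
        return self._respond(metadata, "completed", count)

    @staticmethod
    def _respond(metadata: dict, status: str, result: Optional[int]) -> dict:
        # The response carries the same metadata for the return trip.
        return {"metadata": metadata, "status": status, "result": result}


client = DataMartClient(
    approve=lambda meta: meta["query_type"] == "aggregate_count",
    execute=lambda query: 42,   # stand-in for a local database call
)
accepted = client.handle({"metadata": {"query_type": "aggregate_count"},
                          "query": "<opaque query text>"})
declined = client.handle({"metadata": {"query_type": "line_list"},
                          "query": "<opaque query text>"})
```

Keeping the approval decision in a function supplied by the site mirrors the idea that each partner applies its own workflow before anything is executed or released.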

i2b2 

For query composition and processing, Informatics for Integrating 
Biology and the Bedside (i2b2) was selected. i2b2 is an open- 
source clinical data analytics platform funded by the NIH and 
used at over 100 sites nationwide. 39 40 It provides an intuitive, 
graphical, web-based query builder as well as a flexible analytical 
database design. Its component-based architecture makes it easily 
adaptable to new-use cases, and it has already been used in 
another distributed query platform called SHRINE, the Shared 
Health Research Information Network. 6 Over a third of PCORI's
PCORnet is currently using i2b2, including both Clinical Data
Research Networks and Patient-Powered Research Networks.
Harvard's PCORnet Clinical Data Research Network involves the
interoperability of i2b2 at 10 health systems. 38 41

i2b2 is a set of web service components, known as cells, that 
collectively make up a 'hive'. It is possible to add and remove 
components for different use cases of i2b2. To Query 
Health-enable i2b2, we created three new cells: 

► A PopMedNet client adapter, which sends investigator- 
developed queries to the PopMedNet web portal. 

► A PopMedNet server adapter, which receives queries from 
and sends results back to PopMedNet. 

► An HQMF translator, able to process HQMF revision 2. 

These cells integrate into the hive, as described below and shown
in figures 3 and 4. The i2b2 platform and these cells are
open-source. 15 17

When an investigator develops a query in the Query Health 
i2b2 query builder, the query is sent to the PopMedNet Client 
Adapter rather than the local data repository. This Adapter then 
transmits the query (which is in an i2b2-compliant XML 
format) to the HQMF translator, and it sends the resulting 
HQMF query onward to the PopMedNet portal. The query 
builder then displays results as they arrive in the 'previous 
queries' window (figure 3). 

Table 1 Summary of the three Query Health pilots

Pilot site: New York City Department of Health and Mental Hygiene
  Goal and feedback:
    Initial goal: Demonstrate standards-based vendor-neutral distributed
    analytics solution using Query Health Standards
    Current goal: Launch health information exchange-based solution to
    obtain aggregate city-wide healthcare statistics
    Feedback: Standards-based aggregate distributed analytic solutions are
    now capable of delivering valuable results. Challenges remain in
    cross-site data harmonization
  Technology: Composer: i2b2; Envelope: PopMedNet; Processor: i2b2;
    Other: HQMF, Ontology
  Status: Initial pilot: Complete. Second phase: Q2 2014

Pilot site: FDA Mini-Sentinel
  Goal and feedback:
    Goal: Expand medical product safety surveillance capabilities to i2b2
    clinical data sources
    Feedback: Clinical data sources can provide important additional data for
    medical product safety surveillance. Resources required for data
    normalization and maintaining additional software at data partners must
    be carefully considered
  Technology: Composer: i2b2; Envelope: PopMedNet; Processor: i2b2;
    Other: HQMF
  Status: Three-month trial complete

Pilot site: MDPHnet
  Goal and feedback:
    Goal: Implement Query Envelope security and authentication in existing
    public health surveillance network
    Feedback: The Query Envelope enhancements provide standards-based
    approach for distribution of queries and return of results
  Technology: Composer: PopMedNet; Envelope: PopMedNet; Processor: EHR
    Support for Public Health (ESP); Other: (n/a)
  Status: Successful, incorporated into subsequent PopMedNet software
    releases

Figure 3 The Informatics for Integrating Biology and the Bedside
(i2b2) Query Composer. Queries are composed in the graphical i2b2
query builder using the Query Health Consolidated Clinical Document
Architecture (C-CDA) data model. These queries are sent to the
PopMedNet (PMN) client adapter, which translates the query into
Health Quality Measures Format (HQMF) and sends it to the
PopMedNet portal for distribution.

When a Query Health-enabled i2b2 system is sent an HQMF
query by the PopMedNet Data Mart Client, the query is routed to the
PopMedNet Server Adapter, which similarly sends the query to
the HQMF translator for translation back into i2b2 format. The
Adapter then sends the query to the standard set of i2b2 services
for processing. When processing is completed, the Adapter
passes the results back to the PopMedNet Data Mart Client. We
did not implement QRDA translation in this version of the
reference implementation; results are transmitted in i2b2 XML
format.
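
The round trip through a neutral exchange format can be sketched as follows. The XML produced here is a deliberately tiny, made-up stand-in and is neither HQMF nor i2b2 XML; the function names and the code strings are likewise hypothetical. The sketch shows only the pattern: the client side translates a local query into the neutral format, and the server side translates it back, runs it locally, and returns a count.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Set

# Hypothetical local query form: every listed code must be present.
LocalQuery = Dict[str, List[str]]


def to_exchange_xml(query: LocalQuery) -> str:
    """Client side: local query -> neutral XML (a toy stand-in for HQMF)."""
    root = ET.Element("query")
    for code in query["all_of"]:
        ET.SubElement(root, "criterion", code=code)
    return ET.tostring(root, encoding="unicode")


def from_exchange_xml(xml_text: str) -> LocalQuery:
    """Server side: neutral XML -> local query for the site's own engine."""
    root = ET.fromstring(xml_text)
    return {"all_of": [c.get("code") for c in root.findall("criterion")]}


def server_adapter(xml_text: str, patients: List[Set[str]]) -> int:
    """Translate, execute locally, and return only an aggregate count."""
    wanted = set(from_exchange_xml(xml_text)["all_of"])
    return sum(1 for codes in patients if wanted <= codes)


query = {"all_of": ["DX:diabetes", "RX:metformin"]}
wire = to_exchange_xml(query)
local_patients = [
    {"DX:diabetes", "RX:metformin"},
    {"DX:diabetes"},
    {"DX:diabetes", "RX:metformin", "DX:asthma"},
]
count = server_adapter(wire, local_patients)
```

A real translator has to handle far richer logic (temporal relationships, value sets, nested criteria), which is where the computability constraints discussed in this section arise.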

Details on the HQMF translator service have been described 
elsewhere. 42 We found that HQMF must be tightly constrained 
to ensure it represents a functional, automatable query that can 
be represented within a set of data structures. Implementation 
guides and harmonization efforts within HL7 will improve this 
process, but the complexity of existing CQMs might prove com- 
putationally problematic. 

The HQMF translator relies on an implementation of the
Query Health data ontology, which we implemented using the
i2b2 ontology cell. This makes extensive use of 'modifiers', which
were introduced to i2b2 in 2012. They allow additional information
about core data elements, such as medication route and dose, to be
captured by storing attributes about individual patient observations.
We found that they are a very powerful and efficient way of adding
metadata to observations in star schema databases.
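
To make the modifier idea concrete, the sketch below uses Python's built-in sqlite3 module with a heavily simplified fact table. The column names are loosely modeled on i2b2's observation fact table, and the '@' placeholder for an unmodified fact, the concept codes, and the modifier codes are assumptions for illustration rather than a description of the reference implementation's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE observation_fact (
        patient_num  INTEGER,
        concept_cd   TEXT,     -- what was observed
        instance_num INTEGER,  -- ties modifier rows to one observation
        modifier_cd  TEXT,     -- '@' marks the base fact (assumed convention)
        tval_char    TEXT      -- value of the modifier, if any
    )
""")

# Each observation is one base row plus one row per attribute (modifier).
rows = [
    (1, "MED:metformin", 1, "@", None),
    (1, "MED:metformin", 1, "MOD:route", "oral"),
    (1, "MED:metformin", 1, "MOD:dose", "500 mg"),
    (2, "MED:metformin", 1, "@", None),
    (2, "MED:metformin", 1, "MOD:route", "oral"),
    (3, "MED:insulin", 1, "@", None),
    (3, "MED:insulin", 1, "MOD:route", "subcutaneous"),
]
conn.executemany("INSERT INTO observation_fact VALUES (?, ?, ?, ?, ?)", rows)


def count_patients(concept_cd: str, modifier_cd: str, value: str) -> int:
    """Count distinct patients whose observation carries a given modifier value."""
    (n,) = conn.execute(
        """SELECT COUNT(DISTINCT patient_num)
           FROM observation_fact
           WHERE concept_cd = ? AND modifier_cd = ? AND tval_char = ?""",
        (concept_cd, modifier_cd, value),
    ).fetchone()
    return n


oral_metformin = count_patients("MED:metformin", "MOD:route", "oral")
oral_insulin = count_patients("MED:insulin", "MOD:route", "oral")
```

New attributes are added as extra rows rather than extra columns, so the fact table's shape does not change when a new kind of metadata is introduced.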



hQuery 

An alternative data source for query processing was also selected: 
the hQuery Gateway, developed by MITRE. hQuery is a document 
database that uses JavaScript-based map-reduce queries to effi- 
ciently search Continuity of Care Documents (CCDs). 43 hQuery 
was not used in a formal pilot, but demonstrating that two very 
different data sources can both process HQMF makes a strong 
case for the interoperability of HQMF. Funding for hQuery ceased 
in 2012 and it is no longer officially supported, but Scoop Health 
is developing an alternate version of hQuery. 

We developed two new components for hQuery: 

► A module to convert HQMF queries into native hQuery 
JavaScript queries, able to process both HQMF revision 1 
and 2. 18 

► Integration with PopMedNet to enable hQuery to accept 
queries from and return the results to the sender. 



Figure 4 The Informatics for 
Integrating Biology and the Bedside 
(i2b2) Query Processing Engine. At 
each data partner using i2b2, the 
PopMedNet Data Mart Client sends a 
Health Quality Measures Format 
(HQMF) query to an i2b2 instance with 
a PopMedNet (PMN) server adapter, 
which translates the query into i2b2 
format for execution. Results are 
returned in i2b2 XML format to the 
Data Mart Client by way of the server. 
When a Query Health-enabled hQuery Gateway receives an
HQMF query via PopMedNet, the query is converted into the native
JavaScript query format and executed, and the results are returned
to PopMedNet. As with i2b2, results do not use QRDA format in this
version; instead they are presented in the native hQuery JSON
format.
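
The map-reduce style of query can be illustrated in a few lines. hQuery itself uses JavaScript; this sketch uses Python instead, and the document fields, key format, and function names are hypothetical. It shows only the general technique: a map step emits a key for each matching patient document, and a reduce step sums the emitted values per key.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Iterator, List, Tuple

# Hypothetical parsed patient summary document.
Document = Dict[str, Any]


def map_patient(doc: Document) -> Iterator[Tuple[str, int]]:
    """Emit one (key, 1) pair for each patient that meets the criteria."""
    if "diabetes" in doc["conditions"]:
        yield ("diabetes_" + doc["gender"], 1)


def reduce_counts(key: str, values: List[int]) -> int:
    """Collapse all values emitted for one key into a single count."""
    return sum(values)


def run_map_reduce(docs: Iterable[Document]) -> Dict[str, int]:
    grouped: Dict[str, List[int]] = defaultdict(list)
    for doc in docs:
        for key, value in map_patient(doc):
            grouped[key].append(value)
    return {key: reduce_counts(key, values) for key, values in grouped.items()}


documents = [
    {"gender": "F", "conditions": ["diabetes"]},
    {"gender": "M", "conditions": ["diabetes", "asthma"]},
    {"gender": "F", "conditions": []},
    {"gender": "F", "conditions": ["diabetes", "hypertension"]},
]
result = run_map_reduce(documents)
```

Because the map step sees one document at a time and the reduce step sees only emitted values, the output is aggregate by construction.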

We experienced similar difficulties with the complexity of 
HQMF in implementing the translator to JavaScript. However, 
the native map-reduce-based query format used by hQuery did 
eventually permit translation of fairly complex queries. The 
translator was used in development and testing of the new 
HQMF standard and is now being used at Research And 
Development (RAND) Corporation to develop an HQMF to 
SQL translator that will be applicable to multiple relational data- 
base systems. 

Pilots 

New York City Department of Health 

The New York City Department of Health and Mental Hygiene 
(NYCDOHMH) currently has a network of over 650 primary 
care practices that respond to distributed queries. This enables 
their teams of quality improvement specialists to provide regular 
feedback to the practices in order to respond to public health 
concerns for both chronic and acute conditions. 7 However, 
their current system architecture uses proprietary technology 
that is vendor-specific and not standards-based. Therefore, the 
Department is adopting the Query Health platform to add add- 
itional practices and health information exchange organizations 
that are using a wide variety of health information systems. 

In 2012, the NYCDOHMH conducted a small pilot of Query 
Health across three eClinicalWorks EHR practices' test systems. 
In the pilot, a public health investigator would develop a query 
in a central i2b2 query builder. This was transmitted to the 
PopMedNet portal, which distributed the query to the three 
participating partner sites (Data Marts). Once the query reached 
and was accepted by a Data Mart, a local i2b2 instance executed 
the query against the i2b2 data repository. This i2b2 data reposi- 
tory contained a Query Health data model populated with EHR 
test patients. The practice's result was then transmitted back 
through PopMedNet, which aggregated the results across prac- 
tices for viewing in the i2b2 query builder. 

This pilot found that, like their existing distributed network, 
Query Health's simple aggregate counts can successfully provide 
valuable cross-practice insight with the added benefit of being a 
standards-based technology solution. In October 2013, 
NYCDOHMH completed a prototype test of a Query 
Health-based solution for analyzing the data found within the 
New York statewide health information exchange (SHIN-NY). In 
this solution, an aggregate patient CCD is retrieved across the par- 
ticipating exchange partners and loaded into the i2b2 C-CDA 
system for aggregate querying. Full production release of this 
system is scheduled for early 2014. It is anticipated that it will 
provide significant insights into the quality of care delivered par- 
ticularly in the inpatient settings, which will complement the exist- 
ing primary care setting solution. Using both systems together will 
also permit comparisons of Query Health's accuracy, speed, stabil- 
ity, and capabilities. It is worth noting that despite the use of stan- 
dards, we anticipate that significant work reconciling the various 
implementations of the data elements and value sets across institu- 
tions will remain. 

FDA 

The FDA's Mini-Sentinel project aims to monitor medical product
safety using electronic health data. 44 Mini-Sentinel is using Query
Health standards implemented through PopMedNet for distributed
querying and receiving results within a network of over 130
million individuals. 9 The Query Health pilot investigates adding
i2b2 clinical data repositories to the Mini-Sentinel network to
expand the medical product safety monitoring capabilities of the
network. The FDA Mini-Sentinel Operations Center team at
Harvard Pilgrim Health Care Institute (HPHCI) partnered with
Beth Israel Deaconess Medical Center (BIDMC) and Lincoln Peak
Partners (the PopMedNet technology partner) to implement the
pilot. The i2b2 implementation at BIDMC was the target of the
query.

The overall architecture of this pilot was very similar to the 
New York City pilot — using i2b2, HQMF, and PopMedNet. 
Also, BIDMC used an i2b2 structure compatible with the Query 
Health data model. Therefore this pilot's implementation of the 
reference implementation was fairly straightforward. However, 
BIDMC did not have resources to host the PopMedNet Data 
Mart Client software locally. Therefore a new network architec- 
ture was developed in which the Data Mart Client was installed 
securely in the cloud rather than behind the firewall at BIDMC, 
while the i2b2 data remained behind the BIDMC firewall. 
Queries were delivered to BIDMC via a secure virtual private 
network tunnel. This architecture avoided the need to install
the PopMedNet software within BIDMC, but still gave BIDMC 
the ability to review queries before execution and review 
responses before release. Because only aggregate counts passed 
through the tunnel, BIDMC viewed this architecture as no more 
of a security risk than if the software were installed behind their 
firewall. This pilot was time-limited based on the agreement 
between HPHCI and BIDMC and was shut down after approxi- 
mately 3 months of running successfully, but was memorialized 
in a video available on YouTube. 45 The pilot demonstrated the 
feasibility of connecting the FDA Mini-Sentinel network to an 
i2b2 end point, which could be used to expand Mini-Sentinel to 
include data from the dozens of healthcare centers that already 
have their clinical data in i2b2 format. 

MDPHnet 

MDPHnet is an ONC-funded project overseen by the 
Massachusetts eHealth Institute in collaboration with the 
Massachusetts Department of Public Health. MDPHnet is a 
population-based EHR surveillance network targeting a broad 
array of health indicators across multiple providers and delivery 
systems. The project integrates two software systems — 
PopMedNet and EHR Support for Public Health (ESP) 46 — into 
a single platform (ESPnet) for population-based surveillance 
using EHR data. 47 The Query Health platform is enabling 
secure, standardized queries on the same architecture. This pilot 
focused on fully implementing the Query Envelope standard, to 
demonstrate its flexibility and granular security control. 

MDPHnet was already being deployed before Query Health 
was initiated. This pilot's goal was a technical one only: deploy- 
ment of a Query Health-compatible version of PopMedNet. 
This pilot was deployed successfully within MDPHnet and is 
still being used. The Query Envelope and associated query meta- 
data capture developed as part of this pilot is now integrated 
fully into the PopMedNet software package and is being used 
by all PopMedNet-based networks. 

DISCUSSION 

Query Health is a powerful test of the nation's progress toward
an LHS. The initiative has been quite successful: Query Health
has created a vendor-neutral, standards-based approach for
distributed population health queries; we have delivered an
open-source reference implementation with several alternative
configurations; we have co-developed a new revision of HQMF; and
we have validated our system for three very different use cases.
This work demonstrates enthusiastic collaboration for LHS
initiatives across several research groups, government entities,
and clinical practices. 15 17 18 45

Query Health also uncovered a challenge in our electronic 
healthcare architecture: most clinical systems, unlike those used 
in the pilots, do not use common data models. To address this 
concern, the ONC has launched the Data Access Framework 
(DAF), an initiative to provide standards and implementation 
guidance for cross-platform normalized data access. DAF is a 
multilevel initiative that will encompass standards for both 
intra- and inter-organizational queries on both individuals and 
populations. This work is a challenging but necessary step 
before distributed networks such as Query Health
can become viable on a national scale.

In the meantime, components of and lessons learned from 
the Query Health initiative are being used for new initiatives 
from the reference implementation team. The NIH Health Care 
System Research Collaboratory is developing new distributed 
networks powered by PopMedNet, including a deeper integra- 
tion of PopMedNet and i2b2. A pilot recently demonstrated 
translation of an MDPHnet query to execute directly against an 
i2b2 database. Both PopMedNet and i2b2 will also be used by 
portions of PCORI's PCORnet. The i2b2 community can
continue to expand HQMF support as federal guidelines for its use
are finalized. The i2b2 implementation of the Query Health 
ontology is being leveraged to develop CCD import tools, 
which will allow i2b2 to be deployed more broadly. The 
New York City pilot is continuing to integrate regional health 
information exchanges into their Query Health network. 

CONCLUSION 

Query Health has created a vendor-neutral, standards-based 
approach for distributed population health queries. As Query 
Health evolves, on-the-fly translation between HQMF and local 
formats may allow interoperability among systems, creating an 
infrastructure for comprehensive population health queries. 
Lessons learned from the Query Health experiment are inform- 
ing the ONC's DAF, which will encourage data availability for 
future cross-platform use cases. The present reference imple- 
mentation provides a common set of components for distributed 
population health queries that has been validated at three sites. 